This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code. This notebook replicates results in \(\textbf{New tools for network time series with an application to COVID-19 hospitalisations}\); see https://arxiv.org/abs/2312.00530, and produces further Corbit and R-Corbit plots examples.
Simulate realisations of differing length coming from the stationary global-\(\alpha\) GNAR model below
\[ \begin{equation} \label{eq:vector wise representation} \boldsymbol{X}_{t} = \sum_{k = 1}^{p} \left (\alpha_{k} \boldsymbol{X}_{t - k} + \sum_{r = 1}^{s_k} \beta_{kr} \boldsymbol{Z}_{r, t - k} \right ) + \boldsymbol{u}_{t} , \end{equation} \] We use the \(\texttt{fiveNet}\) (included in the CRAN \(\texttt{GNAR}\) package) network throughout this notebook for producing all relevant figures.
We begin by loading the GNAR package into our session in \(\texttt{R}\).
library(GNAR)
source("/Users/danielsalnikov/Documents/PhD/my_stuff/papers/corbit_paper/R_scripts/R_scripts_revised/R-Corbit_Covid.R")
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## Loading required package: viridisLite
##
## Attaching package: 'ggpubr'
## The following object is masked from 'package:forecast':
##
## gghistogram
We simulate a realisation of length one-hundred by using the function \(\texttt{GNARsim}\). The code for producing the first simulation also shows how to compute the \(r\)-stage adjacency matrices and the weights matrix for the \(\texttt{fiveNet}\) network.
n_sims = 100
# Compute the r-stage adjacency matrices
stages_tensor = get_k_stages_adjacency_tensor(
as.matrix(GNARtoigraph(fiveNet)),
3)
# Compute the weights-matrix for the fiveNet network
W_normalised = weights_matrix(fiveNet, 3)
# First Simulation GNAR(1, [1])
sim1 <- GNARsim(n = n_sims, net=fiveNet,
alphaParams = list(rep(0.47, 5)),
betaParams = list(c(0.46)))
We proceed to compute a Corbit plot for the simulation produced above.
m_lag = 20
m_stage = 5
corbit_plot(vts = sim1, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 1: Corbit plot for a realisation of length 100 coming from a stationary global-alpha GNAR(1, [1]) process.
The Corbit plot above uses the \(\texttt{cividis}\) colour scale, however, users can specify any colour scale supported by the \(\texttt{viridis}\) package (all of which are colour-blind friendly). Note that the Corbit plot can quickly aid analysts for detecting seasonality and trend in an observed realisation of a (network) time series. We incorporate a seasonal term as a cosine wave of period four into simulation one and produce the new Corbit plot.
m_lag = 20
m_stage = 5
sim1_seasonal = vapply(c(1:5), function(x) {sim1[, x] + 2 * (1 + cos(c(1:n_sims) * pi / 2)) }, rep(0, n_sims))
corbit_plot(vts = sim1_seasonal, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 2: Corbit plot for a realisation of length 100 coming from a seasonal global-alpha GNAR(1, [1]) process.
The Corbit plot in Figure 2 successfully reflects the seasonal period every four lags. We incorporate a linear trend into Simulation 1 below, and analyse the resulting Corbit plot below.
m_lag = 20
m_stage = 5
sim1_trend = vapply(c(1:5), function(x) {sim1[, x] + c(1:n_sims) }, rep(0, n_sims))
corbit_plot(vts = sim1_trend, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 3: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend.
Finally, we incorporate both a trend and seasonal component into Simulation 1, and analyse the resulting Corbit plot.
m_lag = 20
m_stage = 5
sim1_trend_seasonal = vapply(c(1:5), function(x) {sim1[, x] + 2 * (1 + cos(c(1:n_sims) * pi / 2)) + c(1:n_sims) / 10 }, rep(0, n_sims))
corbit_plot(vts = sim1_trend_seasonal, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'cividis')
Figure 3: Corbit plot for a realisation of length 100 coming from a global-alpha GNAR(1, [1]) process plus a linear trend.
We simulate a realisation of length one-thousand coming from a stationary global-\(\alpha\) GNAR\((3, [2, 1, 0])\) process. Subsequently, we compute the mean value of the NACF and PNACF at each iteration.
# Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 2])
nsims_2 = 1000
sim2 <- GNARsim(n = nsims_2, net=fiveNet,
alphaParams = list(rep(0.12, 5), rep(0.18, 5)),
betaParams = list(c(0.21), c(-0.32, 0.12)))
# Compute the corresponding Corbit plot for the PNACF
corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'turbo', partial = "yes")
We examine another model.
# Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 2])
nsims = 100
sim2 <- GNARsim(n = nsims, net=fiveNet,
alphaParams = list(rep(0.21, 5), rep(0.18, 5)),
betaParams = list(c(0.12), c(0.32, 0.12)))
# Compute the corresponding Corbit plot for the PNACF
corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'viridis', partial = "yes")
We examine a basic GNAR model
# Simulate a realisation of length 100 coming from a stationary GNAR(2, [1, 2])
nsims = 100
sim2 <- GNARsim(n = nsims, net=fiveNet,
alphaParams = list(rep(0.35, 5), rep(0.12, 5)),
betaParams = list(c(0.24), c(0.17)))
# Compute the corresponding Corbit plot for the PNACF
corbit_plot(vts = sim2, net = fiveNet, max_lag = m_lag, max_stage = 3, viridis_color_option = 'viridis', partial = "yes")
R-Corbit plots allow quick comparison of network correlation structures for different time-slices and/or covariates (e.g., labels). We produce an example comparing three distinct GNAR simulations below.
# Initialise values
m_lag = 20
m_stage = 3
n_sims = 200
set.seed(2023)
# Produce GNAR simulations
sim1 <- GNARsim(n = n_sims, net=fiveNet, alphaParams = list(c(0.1, 0.12, 0.16, 0.075, 0.21),
c(0.12, 0.14, 0.15, 0.3, 0.22)),
betaParams = list(c(0.16, 0.10), c(0.14, 0.11)))
sim2 <- GNARsim(n = n_sims, net=fiveNet, alphaParams = list(rep(0.23, 5), rep(0.11, 5)),
betaParams = list(c(0.21), c(0.12)))
sim3 <- GNARsim(n = n_sims, net=fiveNet, alphaParams = list(rep(0.12, 5), rep(0.10, 5)),
betaParams = list(c(0.25, 0.27), c(0.18)))
# Produce R-Corbot plots
R_corbit_plot(list(sim1, sim2, sim3), list(fiveNet), m_lag, m_stage, frame_names = c("GNAR(1, [2, 2]) simulation.", "GNAR(1, [1]) simulation.", "GNAR(1, [2]) simulation."), same_net = "yes")
R_corbit_plot(list(sim1, sim2, sim3), list(fiveNet), m_lag, m_stage, frame_names = c("GNAR(1, [2, 2]) simulation.", "GNAR(1, [1]) simulation.", "GNAR(1, [2]) simulation."), same_net = "yes", partial = "yes")
We start by loading the NHS trust network time series (included in the \(\texttt{GNAR}\) package). The network time series is a realisation of length four-hundred-and-fifty-two (452) that corresponds to the logarithm of the daily number of patients occupying mechanical ventilation beds, i.e., \(\log (1 + Y_{i, t})\), where \(Y_{i, t}\) is the number of patients occupying mechanical ventilation beds in NHS trust \(i = 1, \dots, 140\) on day \(t = 1, \dots, 452\).
# Preliminary objects
# load the NHS trust network time series
covid_data = logMVbedMVC.vts
# Compute the weight matrix for the NHS trusts network
W_nhs_trust = weights_matrix(NHSTrustMVCAug120.net, 1)
We visualise the evolution of the series by plotting the sum of the total number of patients transferred to mechanical ventilation beds on each day. Further, we compute the mean and standard deviaiton up to day \(t\) for exploratory data analysis.
sum_series = as.ts(rowSums(logMVbedMVC.vts))
plot(c(1:452), sum_series, 'l', xlab = 't', main = expression(paste('Plot of the Sum of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))),
ylab = expression(Y[t]), col = 'blue')
plot(c(1:452), sum_series / 140, 'l', xlab = 't', main = expression(paste('Plot of daily mean occupancy of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140) / 140)),
ylab = expression(Y[t]), col = 'blue')
# compute mean and standard deviation up top day t
sum_series_mean = vapply(c(1:length(sum_series)), function(x) {mean(sum_series[1:x])}, 0)
sum_series_sd = vapply(c(1:length(sum_series)), function(x) {sd(sum_series[1:x])}, 0)
# plot mean and standard deviation
plot(c(1:452), sum_series_mean, 'l', xlab = 't', main = expression(paste('Plot of the mean sum of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))),
ylab = expression(mean(Y[t])), col = 'blue')
plot(c(1:452), sum_series_sd, 'l', xlab = 't', main = expression(paste('Plot of the sum sd of all NHS time-series: ', Y[t] == sum(X["i, t"], i == 1, 140))),
ylab = expression(sd(Y[t])), col = 'blue')
We analyse the network autocorrelation structure by analysing the Corbit plots below. First, we produce the NACF Corbit plot.
corbit_plot(covid_data, NHSTrustMVCAug120.net, 20, 6)
Subsequently, we analyse the PNACF Corbit plot, which aids model selection and visualising dependence in the data.
corbit_plot(covid_data, NHSTrustMVCAug120.net, 10, 6, partial="yes")
The Corbit plots above suggest that network autocorrelation cuts-off after the first lag, further, that one-stage neighbours appear to have a stronger correlation than higher-order \(r\)-stage neighbours. These Corbit plots suggest comparing the following models.
# Create a table for storing model diagnostic statistics
GNAR_model_comparison_table <- array(dim = c(6, 5))
colnames(GNAR_model_comparison_table) <- c("No. of param. ", " One-Step SE ", " Two-Step SE ", "AIC", "BIC")
We compare GNAR fits for the time-steps selected below.
covid_data_set = covid_data[1:447, ]
test_set = covid_data[448:452, ]
The code for fitting a GNAR\((1, [1])\) is
# GNAR(1, [1])
# model order specification
plag = 1
stages = c(1)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[1, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[1, 4] = AIC(covid_fit)
GNAR_model_comparison_table[1, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[1, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[1, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((1, [6])\) is
# GNAR(1, [6])
# model order specification
plag = 1
stages = c(6)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[2, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[2, 4] = AIC(covid_fit)
GNAR_model_comparison_table[2, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[2, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[2, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((2, [6, 6])\) is
# GNAR(2, [6, 6])
# model order specification
plag = 2
stages = c(6, 6)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[3, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[3, 4] = AIC(covid_fit)
GNAR_model_comparison_table[3, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[3, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[3, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((6, [6^6])\) is
# GNAR(6, [6, ..., 6])
# model order specification
plag = 6
stages = rep(6, 6)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[4, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[4, 4] = AIC(covid_fit)
GNAR_model_comparison_table[4, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[4, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[4, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((1, [0])\) is
# GNAR(1, [0])
# model order specification
plag = 1
stages = c(0)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[5, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[5, 4] = AIC(covid_fit)
GNAR_model_comparison_table[5, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[5, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[5, 3] = sum((predict(new_fit)- test_set[2, ])^2)
The code for fitting a GNAR\((10, [0])\) is
# GNAR(10, [0])
# model order specification
plag = 10
stages = rep(0, 10)
# fitting the model
covid_fit <- GNARfit(covid_data_set, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
# Number of parameters
GNAR_model_comparison_table[6, 1] = length(covid_fit$mod$coefficients)
GNAR_model_comparison_table[6, 4] = AIC(covid_fit)
GNAR_model_comparison_table[6, 5] = BIC(covid_fit)
# one-step ahead forecast
one_step = predict(covid_fit)
GNAR_model_comparison_table[6, 2] = sum((one_step - test_set[1, ])^2)
# two-step ahead forecast
new_data = rbind(covid_data_set, one_step)
new_fit = GNARfit(new_data, net = NHSTrustMVCAug120.net, alphaOrder = plag, betaOrder = stages)
GNAR_model_comparison_table[6, 3] = sum((predict(new_fit)- test_set[2, ])^2)
We compare the model by printing the following table.
GNAR_model_comparison_table = round(GNAR_model_comparison_table, 3)
rownames(GNAR_model_comparison_table) <- c("GNAR(1, [1])", "GNAR(1, [6])", "GNAR(2, [6, 6])", "GNAR(6, [6,:6])", "GNAR(1, [0])", "GNAR(10, [0])")
print(GNAR_model_comparison_table)
## No. of param. One-Step SE Two-Step SE AIC
## GNAR(1, [1]) 2 5.088 8.808 -452.393
## GNAR(1, [6]) 7 5.064 8.799 -452.532
## GNAR(2, [6, 6]) 14 5.166 8.218 -458.883
## GNAR(6, [6,:6]) 42 5.238 9.882 -467.350
## GNAR(1, [0]) 1 5.030 8.845 -451.739
## GNAR(10, [0]) 10 5.027 8.654 -472.229
## BIC
## GNAR(1, [1]) -452.375
## GNAR(1, [6]) -452.468
## GNAR(2, [6, 6]) -458.755
## GNAR(6, [6,:6]) -466.964
## GNAR(1, [0]) -451.730
## GNAR(10, [0]) -472.137
We compare a GNAR\((1, [1])\) with VAR and sparse VAR models. We begin by loading the required libraries.
library(forecast)
library(vars)
## Loading required package: MASS
## Loading required package: strucchange
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: urca
## Loading required package: lmtest
library(sparsevar)
##
## Attaching package: 'sparsevar'
## The following object is masked from 'package:forecast':
##
## accuracy
We perform one-step ahead prediction ten times. The starting time-step and error vectors are initialised below.
# number of one-step ahead predictions
# dummt counter variable
k = 0
t_steps = 10
# starting point
covid_data = logMVbedMVC.vts
initial_start = nrow(covid_data) - t_steps
decimal_digits = 3
# AR model prediction error
ar_pe = rep(0, t_steps)
# global-alpha GNAR model prediction error
gnar_pe = rep(0, t_steps)
# local-alpha GNAR model prediction error
gnar_no_global_pe = rep(0, t_steps)
# VAR prediction error
var_pe = rep(0, t_steps)
# restricted VAR prediction error
res_var_pe = rep(0, t_steps)
# sparse VAR prediction error
sparse_var_pe = rep(0, t_steps)
We perform one-ahead forecasting ten times and save the squared error.
for (i in 1:t_steps) {
# initialise the comparison table
output <- matrix(rep(0, 18), nrow = 6, ncol = 3)
train_end = initial_start + k
test_point = train_end + 1
# We compare the GNAR models in this script
# 140 AR(1) models
arforecast <- apply(covid_data[1:train_end, ], 2, function(x){
forecast(auto.arima(x, d = 0, D = 0, max.p = 1, max.q = 0,
max.P = 0, max.Q = 0, stationary = TRUE, seasonal = FALSE, ic = 'bic',
allowmean = FALSE, allowdrift = FALSE, trace = FALSE), h = 1)$mean
})
output[5, 2] = "$140$"
output[5, 1] = "AR(1)"
output[5, 3] = round(sum((arforecast - covid_data[test_point, ])^2), digits = decimal_digits)
ar_pe[i] = output[5, 3]
# GNAR(1, [1]) global-alpha fit
covid_fit <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1))
output[1, 2] = 2
output[1, 3] = round(sum((predict(covid_fit) - covid_data[test_point, ])^2), digits = decimal_digits)
gnar_pe[i] = output[1, 3]
output[1, 1] = "GNAR(1, [1])"
# GNAR(1, [1]) local-alpha fit
covid_fit <- GNARfit(covid_data[1:train_end, ], net = NHSTrustMVCAug120.net, alphaOrder = 1, betaOrder = rep(1, 1), globalalpha=FALSE)
output[6, 2] = 141
output[6, 3] = round(sum((predict(covid_fit) - covid_data[test_point, ])^2), digits = decimal_digits)
gnar_no_global_pe[i] = output[6, 3]
output[6, 1] = "GNAR(1, [1])*"
# Restricted VAR(1) model
varforecast <- predict(restrict(VAR(covid_data[1:train_end, ], p = 1, type = 'none')), n.ahead = 1)
get_var_fcst <- function(x){return (x[1])}
var_point_fcst <- unlist(lapply(varforecast$fcst, get_var_fcst))
output[3, 3] = round(sum((var_point_fcst - covid_data[test_point, ])^2), digits = decimal_digits)
res_var_pe[i] = output[3, 3]
active = 0.0
aux = varforecast$model$varresult
for (ii in 1:ncol(covid_data)) {
for (j in 1:length(aux[[ii]][[1]])){
if (abs(aux[[ii]][[1]][j]) > 0.0) {
active = active + 1
}
}
}
output[3, 1] = "Res. VAR(1)"
output[3, 2] = active
# Classic VAR(1) model
varforecast <- predict(VAR(covid_data[1:train_end, ], p = 1, type = 'none'), n.ahead = 1)
get_var_fcst <- function(x){return (x[1])}
var_point_fcst <- unlist(lapply(varforecast$fcst, get_var_fcst))
output[2, 2] = "$140^2$"
output[2, 3] = round(sum((var_point_fcst - covid_data[test_point, ])^2), digits = decimal_digits)
var_pe[i] = output[2, 3]
output[2, 1] = "VAR(1)"
# Sparse VAR(1) model
sparse_var_forecast <- fitVAR(covid_data[1:train_end, ], p = 1)
xhat = sparse_var_forecast$A[[1]] %*% covid_data[train_end, ]
output[4, 3] = round(sum((xhat - covid_data[test_point, ])^2), digits = decimal_digits)
sparse_var_pe[i] = output[4, 3]
b = sparse_var_forecast$A[[1]]
b[abs(b) > 0.0] = 1
output[4, 2] = sum(rowSums(b))
output[4, 1] = "Sparse VAR(1)"
##########################
# stack all ten tables
colnames(output) <- c("Model", "Active Parameters", "One-Step SPE")
print(output)
cat("\n############################################# \n#############################################\n")
if (i == 1) {
out_data = output
} else {
out_data = rbind(out_data, output)
}
k = k + 1
}
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "5.441"
## [2,] "VAR(1)" "$140^2$" "11.613"
## [3,] "Res. VAR(1)" "3821" "9.902"
## [4,] "Sparse VAR(1)" "2759" "10.109"
## [5,] "AR(1)" "$140$" "82.055"
## [6,] "GNAR(1, [1])*" "141" "5.324"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "7.86"
## [2,] "VAR(1)" "$140^2$" "12.804"
## [3,] "Res. VAR(1)" "3712" "11.504"
## [4,] "Sparse VAR(1)" "2992" "11.533"
## [5,] "AR(1)" "$140$" "78.748"
## [6,] "GNAR(1, [1])*" "141" "7.833"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "10.286"
## [2,] "VAR(1)" "$140^2$" "12.52"
## [3,] "Res. VAR(1)" "3730" "11.605"
## [4,] "Sparse VAR(1)" "3033" "13.008"
## [5,] "AR(1)" "$140$" "82.571"
## [6,] "GNAR(1, [1])*" "141" "10.173"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "12.953"
## [2,] "VAR(1)" "$140^2$" "17.762"
## [3,] "Res. VAR(1)" "3812" "14.472"
## [4,] "Sparse VAR(1)" "3283" "15.057"
## [5,] "AR(1)" "$140$" "86.781"
## [6,] "GNAR(1, [1])*" "141" "12.676"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "7.525"
## [2,] "VAR(1)" "$140^2$" "13.82"
## [3,] "Res. VAR(1)" "3761" "10.815"
## [4,] "Sparse VAR(1)" "3315" "11.995"
## [5,] "AR(1)" "$140$" "89.097"
## [6,] "GNAR(1, [1])*" "141" "7.431"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "5.088"
## [2,] "VAR(1)" "$140^2$" "12.504"
## [3,] "Res. VAR(1)" "3830" "9.976"
## [4,] "Sparse VAR(1)" "3284" "9.863"
## [5,] "AR(1)" "$140$" "88.181"
## [6,] "GNAR(1, [1])*" "141" "4.989"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "4.44"
## [2,] "VAR(1)" "$140^2$" "7.978"
## [3,] "Res. VAR(1)" "3738" "7.716"
## [4,] "Sparse VAR(1)" "3030" "7.918"
## [5,] "AR(1)" "$140$" "91.004"
## [6,] "GNAR(1, [1])*" "141" "4.466"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "3.061"
## [2,] "VAR(1)" "$140^2$" "7.196"
## [3,] "Res. VAR(1)" "3759" "5.888"
## [4,] "Sparse VAR(1)" "3017" "6.842"
## [5,] "AR(1)" "$140$" "88.066"
## [6,] "GNAR(1, [1])*" "141" "3.089"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "8.868"
## [2,] "VAR(1)" "$140^2$" "10.869"
## [3,] "Res. VAR(1)" "3772" "10.999"
## [4,] "Sparse VAR(1)" "3298" "11.773"
## [5,] "AR(1)" "$140$" "91.151"
## [6,] "GNAR(1, [1])*" "141" "8.607"
##
## #############################################
## #############################################
## Model Active Parameters One-Step SPE
## [1,] "GNAR(1, [1])" "2" "8.623"
## [2,] "VAR(1)" "$140^2$" "15.693"
## [3,] "Res. VAR(1)" "3794" "13.258"
## [4,] "Sparse VAR(1)" "3299" "15.715"
## [5,] "AR(1)" "$140$" "96.361"
## [6,] "GNAR(1, [1])*" "141" "8.546"
##
## #############################################
## #############################################
comp_results = cbind(as.numeric(gnar_pe), as.numeric(sparse_var_pe), as.numeric(res_var_pe), as.numeric(var_pe), as.numeric(ar_pe), as.numeric(gnar_no_global_pe))
print(colMeans(comp_results))
## [1] 7.4145 11.3813 10.6135 12.2759 87.4015 7.3134
aux = out_data[out_data[, 1] == 'Sparse VAR(1)']
print(mean(as.numeric(aux[11:20])))
## [1] 3131
# 3089
aux = out_data[out_data[, 1] == 'Res. VAR(1)']
print(mean(as.numeric(aux[11:20])))
## [1] 3772.9
# 3773
all_summary = matrix(rep(0, 18), nrow = 6, ncol = 3)
all_summary[, 1] = colMeans(comp_results)
all_summary[, 2] = c(2, 2962, 3767, 19600, 140, 141)
for (j in 1:ncol(comp_results)) {
all_summary[j, 3] = sd(comp_results[, j])
}
print(all_summary)
## [,1] [,2] [,3]
## [1,] 7.4145 2 2.977923
## [2,] 11.3813 2962 2.828714
## [3,] 10.6135 3767 2.482946
## [4,] 12.2759 19600 3.184071
## [5,] 87.4015 140 5.146953
## [6,] 7.3134 141 2.900672
For completeness we also compare GNAR with a conditional autoregressive temporal autoregresive (CARar) model. Interestingly, both models perform similarly. It turns out that CARar models can be seen as GNAR models constrained to one-stage neighbourhood regressions. Computation for CARar requires expensice MCMC calls, thus, for brevity we summarise the results below; see for the experiments code.
$ We also remark the difference in execution time (in seconds):$
We analyse the impact individual nodes have on network autocorrelation for our choice of model order. Further, we study the relative importance each node has in pair-wise node regressions, and compare the one-lag cross-correlation matrix with a heat map obtained from Theorem 2, this comparison shows the self-similar clustering behaviour for pair-wise correlations.
We start by defining the different waves of the pandemic and looking at a R-Corbit plot of the realised network autocorrelation.
covid_data = logMVbedMVC.vts
W_nhs_network = weights_matrix(NHSTrustMVCAug120.net, 6)
first_wave = covid_data[1:100, ]
gap = covid_data[101:177, ]
second_wave = covid_data[178:447, ]
time_slices = c('First Wave: 04-2020 -> 07-2020', 'Gap: 07-2020 -> 09-2020', 'Second Wave: 09-2020 -> 07-2021')
R_corbit_plot(list(first_wave, gap, second_wave), list(NHSTrustMVCAug120.net), 10, 6, weight_matrices = list(W_nhs_network), same_net = 'yes',
frame_names = time_slices)
The PNACF R-Corbit plot is produced with the code below.
R_corbit_plot(list(first_wave, gap, second_wave), list(NHSTrustMVCAug120.net), 10, 6, weight_matrices = list(W_nhs_network), same_net = 'yes',
frame_names = time_slices, partial = 'yes')
We explore the importance individual nodes have for predicting each other with the heat-map below. Note that it is not symmetric, the yellow colour highlights the strongest pair-wise realations according to formula (18) in \(\textbf{New tools for network time series with an application to COVID-19 hospitalisations}\).
active_node_plot(second_wave, NHSTrustMVCAug120.net, 1, c(1))
We continue by looking at the impact each node has on the total network autocorrelation (i.e., the column in \(\mathbf{W}\) that dominates autocorrelation). We do this by producing the node relevance hierarchy and plot, which is computed using (17).
node_relevance_plot(NHSTrustMVCAug120.net, 1, colnames(covid_data), 1)
## [[1]]
##
## [[2]]
## Node Relevance
## 1 RGP 0.2388075
## 2 REF 0.2590320
## 3 RNN 0.3423592
## 4 RM1 0.3481702
## 5 RJL 0.3482829
## 6 RWA 0.4361727
## 7 RVV 0.4489792
## 8 RXC 0.4656740
## 9 RLQ 0.4718158
## 10 RCX 0.4981928
## 11 R1F 0.5329288
## 12 RTF 0.5451035
## 13 RTD 0.5607670
## 14 RWD 0.5672991
## 15 RXH 0.5723261
## 16 R0B 0.5776501
## 17 RR7 0.5914945
## 18 RBD 0.6028521
## 19 RWP 0.6045146
## 20 R0D 0.6237309
## 21 RDY 0.6300686
## 22 RBL 0.6343268
## 23 RVY 0.6360888
## 24 RYR 0.6388915
## 25 RBZ 0.6407169
## 26 REN 0.6555618
## 27 RXW 0.6644188
## 28 RTQ 0.6657547
## 29 RXL 0.6672037
## 30 RD1 0.6738285
## 31 RAJ 0.6772388
## 32 RA7 0.6775857
## 33 REM 0.6792473
## 34 RNZ 0.6839019
## 35 RQ3 0.6840216
## 36 RRK 0.6843320
## 37 RET 0.6886182
## 38 RRJ 0.6945536
## 39 RVJ 0.6959880
## 40 RKB 0.6985588
## 41 RHU 0.7060903
## 42 RJR 0.7124531
## 43 RWF 0.7136386
## 44 RH8 0.7216425
## 45 RLT 0.7233334
## 46 RBS 0.7266070
## 47 RJC 0.7274836
## 48 RBQ 0.7277001
## 49 RWE 0.7277560
## 50 RXN 0.7293355
## 51 RDE 0.7317239
## 52 RGN 0.7323235
## 53 RNA 0.7341533
## 54 RPA 0.7402341
## 55 RTR 0.7405527
## 56 RK9 0.7428719
## 57 RXK 0.7429187
## 58 RVW 0.7525139
## 59 RGR 0.7581291
## 60 RBN 0.7648500
## 61 RP5 0.7679661
## 62 RL4 0.7741657
## 63 RHM 0.7741876
## 64 RTX 0.7802541
## 65 RXE 0.7804340
## 66 RFR 0.7804340
## 67 RRF 0.7837088
## 68 RA9 0.7865553
## 69 RH5 0.7924808
## 70 RN7 0.7947044
## 71 RNQ 0.7962871
## 72 RP1 0.7974614
## 73 RBK 0.7993277
## 74 RXP 0.8151463
## 75 RF4 0.8178328
## 76 RTP 0.8213661
## 77 RA4 0.8273467
## 78 RXR 0.8290119
## 79 RJ2 0.8296417
## 80 RJ6 0.8388167
## 81 RCB 0.8419626
## 82 RMC 0.8420556
## 83 RWW 0.8421183
## 84 R1H 0.8458930
## 85 RJ1 0.8477358
## 86 RQX 0.8503438
## 87 RAP 0.8514368
## 88 RP4 0.8529093
## 89 RJZ 0.8534630
## 90 RRV 0.8548508
## 91 RTK 0.8587055
## 92 RJ7 0.8592502
## 93 RKE 0.8602540
## 94 RAL 0.8605143
## 95 RK5 0.8607855
## 96 RNS 0.8640082
## 97 RX1 0.8650822
## 98 RVR 0.8651983
## 99 RPY 0.8655824
## 100 RQM 0.8663747
## 101 RYJ 0.8680895
## 102 RWH 0.8682384
## 103 RT1 0.8695644
## 104 RR8 0.8723391
## 105 RGT 0.8759949
## 106 RGM 0.8763338
## 107 RAX 0.8783572
## 108 RC9 0.8839073
## 109 RXF 0.8857286
## 110 RA2 0.8887197
## 111 RAN 0.8925245
## 112 RM3 0.8945031
## 113 RQW 0.8950113
## 114 R1K 0.8981421
## 115 RW6 0.8999965
## 116 RAE 0.9023903
## 117 RAS 0.9059743
## 118 RDU 0.9060968
## 119 RTG 0.9105164
## 120 RWG 0.9174765
## 121 RN5 0.9209021
## 122 RHQ 0.9258074
## 123 RBT 0.9285356
## 124 RD8 0.9324912
## 125 RWY 0.9352884
## 126 RCF 0.9354989
## 127 RFF 0.9416439
## 128 RCU 0.9437173
## 129 R0A 0.9490081
## 130 RFS 0.9523243
## 131 RTH 0.9524462
## 132 RNU 0.9609052
## 133 RJE 0.9688577
## 134 RN3 0.9700914
## 135 RMP 0.9724912
## 136 RCD 0.9729392
## 137 RHW 0.9736950
## 138 RWJ 0.9790502
## 139 RJN 0.9880315
## 140 RXQ 1.0000000
We compare the observed one-lag cross-correlation and the structure Theorem 2 suggests in the plots below. Note that the neighbourhood and self-similarity suggests that outbreaks are mostly dependent on direct neighbours, i.e., the virus cannot reach nodes by jumps it must traverse all NHS trusts (i.e., nodes) in a path between two places for instigating an outbreak.
# larger r-stage changes the correlation structure
max_r_stage = 1
local_relevance_plot(NHSTrustMVCAug120.net, max_r_stage)
cross_correlation_plot(1, covid_data)
All three plots, except for the node relevance one, have the following column and row names.
dummy_vec = vapply(c(1:140), function(x) {cat(paste0(x, ": ", colnames(covid_data)[x]), "\n"); return (0)}, 0)
## 1: RCF
## 2: RBS
## 3: RTK
## 4: RF4
## 5: RFF
## 6: R1H
## 7: RC9
## 8: RQ3
## 9: RXL
## 10: RMC
## 11: RAE
## 12: RXH
## 13: RXQ
## 14: RWY
## 15: RGT
## 16: RT1
## 17: RQM
## 18: RFS
## 19: RJR
## 20: RXP
## 21: RJ6
## 22: RN7
## 23: RP5
## 24: RBD
## 25: RDY
## 26: RJN
## 27: RVV
## 28: RXR
## 29: RDE
## 30: RXC
## 31: RWH
## 32: RVR
## 33: RDU
## 34: RR7
## 35: RLT
## 36: RTQ
## 37: RP4
## 38: RN3
## 39: RJ1
## 40: RN5
## 41: RCD
## 42: RQX
## 43: RWA
## 44: RYJ
## 45: R1F
## 46: RGP
## 47: RNQ
## 48: RJZ
## 49: RAX
## 50: RXN
## 51: RR8
## 52: RJ2
## 53: RBQ
## 54: REM
## 55: R1K
## 56: RWF
## 57: R0A
## 58: RPA
## 59: RBT
## 60: RXF
## 61: RAJ
## 62: RD8
## 63: RM1
## 64: RVJ
## 65: RNN
## 66: RAP
## 67: RVW
## 68: RGN
## 69: RNS
## 70: RP1
## 71: RBZ
## 72: RJL
## 73: RTF
## 74: RX1
## 75: RNU
## 76: RTH
## 77: RW6
## 78: RHU
## 79: RXE
## 80: RHW
## 81: REF
## 82: RH8
## 83: RAL
## 84: RAN
## 85: RGM
## 86: RA2
## 87: RD1
## 88: RM3
## 89: RNZ
## 90: RXK
## 91: RCU
## 92: RHQ
## 93: RK5
## 94: RH5
## 95: RTR
## 96: R0B
## 97: RJC
## 98: RVY
## 99: RJ7
## 100: RBN
## 101: RWJ
## 102: RTP
## 103: RMP
## 104: REN
## 105: RNA
## 106: RAS
## 107: RTD
## 108: RQW
## 109: RCX
## 110: RFR
## 111: RPY
## 112: RRJ
## 113: RL4
## 114: RXW
## 115: RET
## 116: RA9
## 117: RWD
## 118: RRV
## 119: RHM
## 120: RRK
## 121: RA7
## 122: RKB
## 123: R0D
## 124: RK9
## 125: RTG
## 126: RWE
## 127: RTX
## 128: RJE
## 129: RBK
## 130: RWW
## 131: RWG
## 132: RGR
## 133: RYR
## 134: RKE
## 135: RBL
## 136: RWP
## 137: RRF
## 138: RLQ
## 139: RA4
## 140: RCB